Evidence for the Pareto principle in Open Source Software Activity

Authors

  • Mathieu Goeminne
  • Tom Mens
Abstract

Numerous empirical studies analyse evolving open source software (OSS) projects and try to estimate the activity and effort in these projects. Most of these studies, however, focus only on a limited set of artefacts, namely source code and defect data. In our research, we extend the analysis by also taking into account mailing list information. The main goal of this article is to find evidence for the Pareto principle in this context, by studying how the activity of developers and users involved in OSS projects is distributed: it appears that most of the activity is carried out by a small group of people. Following the GQM paradigm, we provide evidence for this principle. We selected a range of metrics used in economics to measure inequality in the distribution of wealth, and adapted these metrics to assess how OSS project activity is distributed. Regardless of whether we analyse version repositories, bug trackers, or mailing lists, and for all three projects we studied, it turns out that the distribution of activity is highly imbalanced.
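As a rough illustration of how an economic inequality metric can be applied to project activity, the sketch below computes a Gini coefficient and the share of activity contributed by the most active 20% of contributors. The abstract does not name the metrics the authors selected; the Gini coefficient is used here only as one well-known example, and the function names and the commit counts are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means activity is spread perfectly evenly; values close to 1 mean
    it is concentrated in very few contributors.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on ascending-sorted data, with ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, fraction=0.2):
    """Share of total activity produced by the most active `fraction` of contributors."""
    xs = sorted(values, reverse=True)
    total = sum(xs)
    if not xs or total == 0:
        return 0.0
    k = max(1, round(len(xs) * fraction))  # at least one contributor
    return sum(xs[:k]) / total


# Hypothetical commit counts per contributor (not data from the paper).
commits = [120, 45, 10, 8, 5, 4, 3, 2, 2, 1]

print(f"Gini coefficient: {gini(commits):.3f}")
print(f"Share of top 20%: {top_share(commits):.1%}")
```

The same two functions could be applied to counts taken from a bug tracker or a mailing list (for example, reports or messages per person), which is the kind of cross-source comparison the abstract describes.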

Similar resources

The Vital Few and Trivial Many: An Empirical Analysis of the Pareto Distribution of Defects

The Pareto Principle is a universal principle of the “vital few and trivial many”. According to this principle, the 80/20 rule has been formulated with the following meaning: For many phenomena, 80% of the consequences originate from 20% of the causes. In this paper, we applied the Pareto Principle to software testing and analysed 9 open source projects (OSPs) across several releases. The resul...

On the probability distribution of faults in complex software systems

Context. There are several empirical principles related to the distribution of faults in a software system (e.g. the Pareto principle) widely applied in practice and thoroughly studied in the software engineering research providing evidence in their favor. However, the knowledge of the underlying probability distribution of faults, that would enable a systematic approach and refinement of these...

An empirical study of package coupling in Java open-source

Excessive coupling between object-oriented classes in systems is generally acknowledged as harmful and is recognised as a maintenance problem that can result in a higher propensity for faults in systems and a ‘stored up’ future problem. Characterisation and understanding coupling at different levels of abstraction is therefore important for both the project manager and developer both of whom ha...

Minimizing the total tardiness and makespan in an open shop scheduling problem with sequence-dependent setup times

We consider an open shop scheduling problem in which setup and processing times are treated separately, and setup times depend not only on the machine but also on the sequence of jobs processed on that machine. A novel bi-objective mathematical programming model is designed to minimize the total tardiness and the makespan. Among several mult...

Salt: Combining ACID and BASE in a Distributed Database

This paper presents Salt, a distributed database that allows developers to improve the performance and scalability of their ACID applications through the incremental adoption of the BASE approach. Salt’s motivation is rooted in the Pareto principle: for many applications, the transactions that actually test the performance limits of ACID are few. To leverage this insight, Salt introduces BASE t...

Publication date: 2011